N=1:

rank | frequency | n-gram |
---|---|---|
1 | 177922 | -a |
2 | 126079 | -e |
3 | 111416 | -i |
4 | 82537 | -y |
5 | 78300 | -m |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 46315 | -ch |
2 | 42629 | -ie |
3 | 34534 | -go |
4 | 33860 | -em |
5 | 33801 | -ki |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 33451 | -ego |
2 | 23631 | -ych |
3 | 18455 | -nie |
4 | 17744 | -ski |
5 | 13930 | -ami |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 13008 | -iego |
2 | 11355 | -nych |
3 | 8904 | -nego |
4 | 7982 | -kiej |
5 | 7339 | -anie |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 10905 | -kiego |
2 | 6075 | -owych |
3 | 6057 | -skiej |
4 | 5102 | -skich |
5 | 5035 | -owego |

The tables show the most frequent letter n-grams at the end of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is suffix detection rather than prefix detection.
For N=3, the query is:
SELECT @pos:=(@pos+1) AS pos, xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word,3)) AS ngram FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
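
The same counting can be sketched without a database. The Python snippet below is only an illustration of the idea behind the query, not the code used to produce the tables: the function name `top_endings` and the six sample words are hypothetical, and ties are broken alphabetically, which the SQL query does not specify.

```python
from collections import Counter

def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams as (rank, count, ngram)."""
    # Each entry of `words` is counted once, like COUNT(*) over the rows of a word list.
    # A word shorter than n contributes the whole word, as RIGHT(word, n) does in MySQL.
    counts = Counter(w[-n:] for w in words)
    # Sort by descending count; alphabetical tie-breaking is an assumption of this sketch.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, cnt, "-" + gram)
            for rank, (gram, cnt) in enumerate(ranked[:k], start=1)]

# Hypothetical sample, not taken from the corpus.
sample = ["polskiego", "wielkiego", "nowego", "dobrych", "nowych", "polskiej"]
for row in top_endings(sample, 3):
    print(row)
```

Calling `top_endings(sample, n)` with n from 1 to 5 gives one ranking per n-gram length, in the same layout as the tables.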
2.2.5 Most frequent word beginnings